Self-Organizing-Map-Based Metamodeling for Massive Text Data Exploration
Abstract
In this study, we describe the use of the self-organizing map (SOM) as a metamodeling technique for designing a parallel text data exploration system. First, a large text collection is divided into many small data subsets. A separate unitary SOM model, i.e., a base model, is then trained on each subset to produce a word clustering map. In this phase, the base SOM models are trained in parallel for greater computational efficiency. Finally, a SOM-based metamodel learns from all the base models to produce a text category map. For illustration, the proposed metamodel is applied to a massive text data collection.
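The three phases in the abstract (split, train base models in parallel, learn a metamodel from them) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The function names `train_som`, `best_matching_unit`, and `train_metamodel` are hypothetical, as are the map sizes and the learning schedule. It also assumes the input is already a list of numeric feature vectors. Having the metamodel learn from the pooled codebook vectors of the base maps is likewise an assumption, since the abstract does not say how the metamodel uses the base models.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def best_matching_unit(codebook, x):
    """Index of the unit whose weight vector is closest to x (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(codebook[i], x)),
    )


def train_som(data, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Train a small rectangular SOM and return its codebook (one vector per unit)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise every unit from a randomly chosen training sample.
    codebook = [list(rng.choice(data)) for _ in range(rows * cols)]
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood radius
        for x in rng.sample(data, len(data)):    # one shuffled pass over the data
            br, bc = divmod(best_matching_unit(codebook, x), cols)
            for unit, w in enumerate(codebook):
                r, c = divmod(unit, cols)
                # Gaussian neighbourhood measured on the map grid.
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return codebook


def train_metamodel(data, n_subsets=4, base_shape=(3, 3), meta_shape=(4, 4)):
    """Split the data, train base SOMs concurrently, then train a meta-SOM."""
    # Phase 1: divide the collection into small subsets (round-robin split).
    subsets = [data[i::n_subsets] for i in range(n_subsets)]
    # Phase 2: one base SOM per subset, submitted to a worker pool.
    with ThreadPoolExecutor(max_workers=n_subsets) as pool:
        futures = [
            pool.submit(train_som, subset, *base_shape, seed=i)
            for i, subset in enumerate(subsets)
        ]
        base_codebooks = [f.result() for f in futures]
    # Phase 3 (assumption): the metamodel learns from the pooled base codebooks.
    pooled = [w for codebook in base_codebooks for w in codebook]
    meta_codebook = train_som(pooled, *meta_shape, seed=99)
    return base_codebooks, meta_codebook


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic stand-in for feature vectors; not real text data.
    toy = [[rng.random() for _ in range(5)] for _ in range(200)]
    base, meta = train_metamodel(toy)
    print(len(base), "base maps;", len(meta), "metamodel units")
```

The thread pool here only shows the structure of the parallel phase. Pure-Python training is CPU-bound, so actual speedups would need a process pool or a distributed setup, with each base model still trained independently on its own subset.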
Related resources
Text Data Mining
Classification is one of the central issues in any system dealing with text data. The need for effective approaches is dramatically increased nowadays due to the advent of massive digital libraries containing free-form documents. What we are looking for are powerful methods for the exploration of such libraries whereby the discovery of similarities between groups of text documents is the overall ...
Exploration of Document Collections with Self-Organizing Maps: A Novel Approach to Similarity Representation
Classification is one of the central issues in any system dealing with text data. The need for effective approaches is dramatically increased nowadays due to the advent of massive digital libraries containing free-form documents. What we are looking for are powerful methods for the exploration of such libraries whereby the detection of similarities between the various text documents is the overal...
Finding structure in text archives
With the advance and massive growth of electronic text archives, the need for tools emerges, which help to gain insight into the basic structure of the underlying digital library. We present a neural network approach for the analysis and exploration of text archives aiming at the detection and visualization of the inherent structure of the text collection. This cluster visualization technique c...
Exploration of Text Collections with Hierarchical Feature
Document classification is one of the central issues in information retrieval research. The aim is to uncover similarities between text documents. In other words, classification techniques are used to gain insight in the structure of the various data items contained in the text archive. In this paper we show the results from using a hierarchy of self-organizing maps to perform the text classificat...
Exploration of Full-text Databases with Self-organizing Maps
Availability of large full-text document collections in electronic form has created a need for intelligent information retrieval techniques. Especially the expanding World Wide Web presupposes methods for systematic exploration of miscellaneous document collections. In this paper we introduce a new method, the WEBSOM, for this task. Self-Organizing Maps (SOMs) are used to represent documents on...
Publication date: 2006